Observations/Thoughts

Data Discovery

Each new dataset requires a simple, quick, but nearly exhaustive data discovery analysis, and visualization is usually the best method for it. The goals of the initial data discovery phase are to:

  • identify outliers or extreme behavior,
  • observe general trends,
  • assess assumptions for statistical modeling, and
  • provide an easy method to collect documentation and definitions of the data.

The sections below show some quick observations, which seamlessly become initial documentation of the dataset. This document required 6 hours of work, but it should serve as a building block for a framework for reproducible initial investigation of similar datasets.

Mission President Effect

New Investigators

A mission president effect is clearly visible: note the color changes and regression lines by mission president for each mission.

However, we should confirm that the mission president effect isn’t confounded by the number of missionaries. The plot below indicates that the mission president has a potentially strong effect on the number of new investigators within a mission, even after accounting for the number of missionaries.
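One simple way to account for the number of missionaries is to compare per-missionary rates across mission presidents. The following is a minimal sketch of that idea, not the method used to produce the plots in this report. The record layout, the function name `rate_by_president`, and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical weekly records: (mission, president, missionaries, new_investigators).
# All values are made up for illustration.
records = [
    ("A", "President 1", 100, 200),
    ("A", "President 1", 110, 231),
    ("A", "President 2", 120, 360),
    ("A", "President 2", 100, 310),
]

def rate_by_president(rows):
    """Mean weekly new investigators per missionary, grouped by (mission, president)."""
    rates = defaultdict(list)
    for mission, president, missionaries, new_inv in rows:
        if missionaries > 0:  # skip weeks with no reported missionaries
            rates[(mission, president)].append(new_inv / missionaries)
    return {key: mean(vals) for key, vals in rates.items()}

per_president = rate_by_president(records)
for key, rate in sorted(per_president.items()):
    print(key, round(rate, 2))
```

If the per-missionary rates still differ noticeably between presidents of the same mission, the difference is unlikely to be explained by missionary count alone.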

Baptisms/Confirmations

While new investigators represent the top of the funnel, baptisms/confirmations (the bottom of the funnel) also show a mission president effect.

Sacrament Meeting Attendance

To be thorough, I also checked sacrament meeting attendance. Although I was skeptical of a mission president effect on this metric, such an effect appears to be real.

Data Cleaning/Correcting Opportunities

Potential Outliers

Outliers are possible, and we want to be careful about their effect on predictions/goals. Potential outliers identified include:

  • Mission Q has one week where Baptisms/Confirmations spike to 75 when the next highest week was 16. (Maybe the 75 is supposed to be 15?) Because this field can be verified (compare MSR’s number to any self-reported numbers), I would hesitate to change it, but for this analysis I replaced it with the previous week’s value of 2.
  • Mission K has an abnormal spike in New Investigators (1137 on 2010-02-11; the next highest week was 428). This may be real, but if concerned about its effect on modeling, I would consider replacing it with the mean or the previous week’s value. On closer examination, this appears to be a “fat finger” typo that likely should have been reported as 113 or 117 (I chose to change it to 113).
  • Mission J recorded one week where sacrament meeting attendance was 714 when a typical week would be 116. Again, it may be worth a quick call or email to the mission president to confirm the blessed nature of such an event.
    • Mission J also experienced a deep drop in new investigators the same week (2011-12-08) as the spike in sacrament meeting attendance. It appears likely those two numbers were incorrectly switched when reported.
  • Mission L also saw a spike in sacrament meeting attendance: 436 (average attendance is 119). This value was replaced with the previous week’s attendance of 149.

All identified potential outliers were removed or replaced as described above before subsequent data processing and analysis. Other outliers likely remain and require attention.
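The outliers above were identified by inspecting plots. As a complement, a spike of this kind could also be flagged automatically. The sketch below uses a modified z-score based on the median and the median absolute deviation (MAD), a swapped-in technique that is not the method used in this report. The function names, the 3.5 cutoff, and the weekly series are assumptions for illustration; only the spike of 75, the next highest value of 16, and the previous week's value of 2 echo the Mission Q example.

```python
from statistics import median

def flag_spikes(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

def replace_with_previous(values, flagged):
    """Copy the series, replacing each flagged week with the prior week's value."""
    cleaned = list(values)
    for i in flagged:
        if i > 0:  # the first week has no previous value to borrow
            cleaned[i] = cleaned[i - 1]
    return cleaned

# Hypothetical weekly Baptisms/Confirmations counts containing one spike.
weekly = [5, 8, 3, 16, 2, 75, 6, 9, 4, 7]
flagged = flag_spikes(weekly)
cleaned = replace_with_previous(weekly, flagged)
print(flagged, cleaned)
```

A flag from a rule like this is only a prompt for review. Whether to replace the value, and with what, remains a judgment call that is best confirmed against the source records.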

Area Average

Because missions within an area are not homogeneous, mission-to-mission comparisons are difficult. A simple comparison, though, can be made between each mission and the area average. While one might prefer to compare a mission to its own historical behavior/average, comparison to the area average accounts for some general or seasonal trending and helps illustrate mission-to-mission variability.
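The comparison to the area average can be sketched as a per-week ratio. This is a minimal illustration that assumes the area average is the simple mean of the mission values reported for a given week; the report does not state the exact definition. The row layout, function names, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (week, mission, metric value). Values are made up.
rows = [
    ("2010-02-04", "J", 100), ("2010-02-04", "K", 140), ("2010-02-04", "L", 120),
    ("2010-02-11", "J", 90), ("2010-02-11", "K", 150), ("2010-02-11", "L", 120),
]

def area_average_by_week(data):
    """Mean of the metric across all missions reporting in each week."""
    by_week = defaultdict(list)
    for week, _mission, value in data:
        by_week[week].append(value)
    return {week: mean(vals) for week, vals in by_week.items()}

def ratio_to_area(data):
    """Each mission's weekly value divided by that week's area average."""
    area = area_average_by_week(data)
    return {(week, mission): value / area[week]
            for week, mission, value in data
            if area[week] != 0}  # skip weeks where the ratio is undefined

ratios = ratio_to_area(rows)
for key in sorted(ratios):
    print(key, round(ratios[key], 2))
```

A ratio above 1 means the mission was above the area average that week. Because the area average moves with seasonal trends, the ratio removes much of that shared movement.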

The area averages were not smoothed here. One may wish to smooth them (likely using a moving average) so that inherent week-to-week variability doesn’t distract the analyst/viewer from the general trend.
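A trailing moving average of the kind suggested above could look like the following sketch. The function name, the default window of 4 weeks, and the choice to average over fewer points at the start of the series are all assumptions, not choices made in this report.

```python
def moving_average(values, window=4):
    """Trailing moving average; early weeks use however many points are available."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed

# Hypothetical weekly area averages.
area_avg = [10, 20, 30, 40, 50]
print(moving_average(area_avg, window=2))
```

A longer window gives a smoother line but reacts more slowly to real changes, so the window length is worth choosing with the reporting cadence in mind.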

Visualizations

Trelliscope

Trelliscope is an R package that, like Tableau, enables the audience to interact visually with the data. Because Trelliscope is built for use in R, we can leverage other tools to mimic Tableau’s interactivity (in this case a JavaScript visualization library called Bokeh) and go well beyond Tableau’s limits by moving visually through various slices of the data, using features derived directly from the data. The example below should help illustrate the advantages of this approach, which lets the analyst observe the data in ways not easily obtained with other business intelligence (BI) tools. BI reporting tools, e.g. Tableau and Business Objects, can then be leveraged for customized reporting based on the outcomes/results/observations generated via the interactive analysis.

Conclusions/Suggestions

The four metrics provided in the data were used to calculate additional metrics that adjust for the number of missionaries and compare each mission to the aggregated (area) average. These additional metrics allow a mission to be compared with other missions rather than only with its own history. Interpreting and summarizing these visualizations requires an additional commitment to truly understand the story of each mission and to set appropriate goals/expectations.

Goals

Elder Ballard said in his April 2017 general conference address:

“Over the years, I have observed that those who accomplish the most in this world are those with a vision for their lives, with goals to keep them focused on their vision and tactical plans for how to achieve them. Knowing where you are going and how you expect to get there can bring meaning, purpose, and accomplishment to life.”

He followed that with:

“Goal setting is essentially beginning with the end in mind. A key to happiness lies in understanding what destinations truly matter—and then spending our time, effort, and attention on the things that constitute a sure way to arrive there.”

and with:

“Experts on goal setting tell us that the simpler and more straightforward a goal is, the more power it will have. When we can reduce a goal to one clear image or one or two powerful and symbolic words, that goal can then become part of us and guide virtually everything we think and do.”

Based on those comments, I personally feel conflicted about goal setting for missions. One could build sophisticated time-series statistical/machine learning models both to predict mission key indicators and to generate goals for each mission. My experience as a data scientist puts me in that camp, though I realize I would need to learn more about the overall goals and expectations for mission presidents, missionaries, and their experience.